Robust Control of Markov Decision Processes with Uncertain Transition Matrices

Authors

  • Arnab Nilim
  • Laurent El Ghaoui
Abstract

Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems. We consider a robust control problem for a finite-state, finite-action Markov decision process, where uncertainty on the transition matrices is described in terms of possibly nonconvex sets. We show that perfect duality holds for this problem, and that as a consequence, it can be solved with a variant of the classical dynamic programming algorithm, the “robust dynamic programming” algorithm. We show that a particular choice of the uncertainty sets, involving likelihood regions or entropy bounds, leads to both a statistically accurate representation of uncertainty, and a complexity of the robust recursion that is almost the same as that of the classical recursion. Hence, robustness can be added at practically no extra computing cost. We derive similar results for other uncertainty sets, including one with a finite number of possible values for the transition matrices. We describe in a practical path planning example the benefits of using a robust strategy instead of the classical optimal strategy; even if the uncertainty level is only crudely guessed, the robust strategy yields a much better worst-case expected travel time.
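The robust recursion mentioned in the abstract can be illustrated for the simplest uncertainty model it names, a finite number of candidate transition matrices. The following is a minimal sketch and not the authors' implementation. It assumes a finite-horizon, cost-minimizing problem with zero terminal cost, and it assumes the worst case may be chosen independently for each state-action pair. The names `robust_dp`, `costs`, `scenarios`, and `horizon` are hypothetical and introduced only for illustration.

```python
def robust_dp(costs, scenarios, horizon):
    """Finite-horizon robust dynamic programming (illustrative sketch).

    costs[i][a]        : stage cost of action a in state i
    scenarios[k][a][i] : k-th candidate next-state distribution for (i, a)
    horizon            : number of decision stages
    Assumptions (not from the source): zero terminal cost, and the
    worst case is picked separately for every state-action pair.
    """
    n_states = len(costs)
    n_actions = len(costs[0])
    v = [0.0] * n_states          # value at the end of the horizon
    policy = []                   # policy[t][i] = action at stage t, state i
    for _ in range(horizon):      # backward in time: last stage first
        new_v, decision = [], []
        for i in range(n_states):
            q = []
            for a in range(n_actions):
                # Inner problem: largest expected cost-to-go over candidates.
                worst = max(
                    sum(p * vj for p, vj in zip(model[a][i], v))
                    for model in scenarios
                )
                q.append(costs[i][a] + worst)
            # Outer problem: the controller minimizes the worst-case cost.
            best = min(range(n_actions), key=q.__getitem__)
            decision.append(best)
            new_v.append(q[best])
        v = new_v
        policy.insert(0, decision)  # keep the earliest stage at index 0
    return v, policy
```

With a single candidate matrix per action, the inner maximization is trivial and the sketch reduces to the classical recursion. Each additional candidate adds one expected-value computation per state-action pair, so the extra cost in this toy setting grows linearly with the number of candidates.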

Similar resources

Robust Markov Decision Processes with Uncertain Transition Matrices

Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems. We consider a robust control problem for a finite-state, finite-action Markov dec...

Robust Markov Decision Processes for Medical Treatment Decisions

Medical treatment decisions involve complex tradeoffs between the risks and benefits of various treatment options. The diversity of treatment options that patients can choose over time and uncertainties in future health outcomes result in a difficult sequential decision making problem. Markov decision processes (MDPs) are commonly used to study medical treatment decisions; however, optimal pol...

Robust Markov Decision Processes for Medical Treatment Decisions

Medical treatment decisions involve complex tradeoffs between the risks and benefits of various treatment options. The diversity of treatment options that patients can choose over time, and uncertainties in future health outcomes result in a difficult sequential decision making problem. Markov decision processes (MDPs) are commonly used to study medical treatment decisions; however, optimal pol...

A fuzzy treatment of uncertain Markov decision processes: Average case

In this paper, the uncertain transition matrices for inhomogeneous Markov decision processes are described by use of fuzzy sets. Introducing a ν-step contractive property, called a minorization condition, for the average case, we find a Pareto optimal policy maximizing the average expected fuzzy rewards under some partial order. The Pareto optimal policies are characterized by maximal solution...

A Robust Approach to Markov Decision Problems with Uncertain Transition Probabilities

This paper considers a discrete-time infinite horizon discounted cost Markov decision problem in which the transition probability vector for each state-control pair is uncertain. A popular approach to this problem has been to find a policy that performs best in the worst-case scenario. A policy obtained in this manner, however, tends to be conservative. We construct a robust formulation for the...

Journal:
  • Operations Research

Volume 53

Publication date: 2005